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Summary 

We examined the perceptual impact of plant 
noise parameterization for Kalman Filter predic- 
tive compensation of time delays intrinsic to 
head tracked virtual environments (VEs). Sub- 
jects were tested in their ability to discriminate 
between the VE system's minimum latency and 
conditions in which artificially added latency was 
then predictively compensated back to the sys- 
tem minimum. Two head tracking predictors 
were parameterized off-line according to cost 
functions that minimized prediction errors in ( 1 ) 
rotation, and (2) rotation projected into 
translational displacement with emphasis on 
higher frequency human operator noise. These 
predictors were compared with a parameteriza- 
tion obtained from the VE literature for cost 
function (1). Results from 12 subjects showed 
that both parameterization type and amount of 
compensated latency affected discrimination. 
Analysis of the head motion used in the param- 
eterizations and the subsequent discriminability 
results suggest that higher frequency predictor 
artifacts are contributory cues for discriminating 
the presence of predictive compensation. 

Introduction 

Predictive compensation has been widely consid- 
ered as a means for mitigating the consequences 
of the time delays inherent to sensors, computa- 
tion, and display rendering in virtual environ- 
ment (VE) systems (Liang, Shaw, & Green, 1991; 
Friedmann, Stamer, & Pentland, 1 992; Azuma & 
Bishop, 1994; Mazuryk & Gervautz, 1995; 

Nelson, Hettinger, Haas, Russell, Warm, Dember, 

& Stoffregen, 1995; Wu & Ouhyoung, 1995; 
Zikan, Curtis, Sowizral, & Janin, 1995; So & 
Griffin, 1996; Kiruluta, Eizenman, & Pasupathy, 
1997; Akatsuka & Bekey, 1998). While visually 
mediated manual tracking experiments in VEs 
have demonstrated the potential benefit of pre- 
dictive compensation for human performance 
(Wu & Ouhyoung, 1995; Nelson et al., 1995), 


the direct perceptual impact of such compensa- 
tion has never been examined. 

Predictive compensators for VEs operate on cur- 
rent position and orientation measurements to 
extrapolate future positions and orientations 
based on kinematic and dynamic models of 
motion via Kalman filtering or other techniques. 
Perfect prediction, in principle, should remove all 
effects of sensor-to-display latency and intro- 
duce no additional artifacts (e.g., noise or over- 
shoot) to the VE system. Notwithstanding, we 
presume that a practical predictor need neither 
remove all latency nor avoid all artifacts, it need 
only avoid making the user aware of those 
compensator imperfections. 

An important element of Kalman Filter (KF) 
predictor implementation is parameterization of 
the KF components. One approach has been 
numerical optimization to find parameter sets 
that minimize RMS error between the input body 
pail motion and predicted VE output (Liang et 
al., 1991; Azuma & Bishop, 1994; Mazuryk & 
Gervautz, 1995; Kiruluta et al., 1997). Parame- 
ters can also be chosen from estimates or models 
of sensor and human motion characteristics 
(Liang et al., 1991; Friedmann et al., 1992; 
Kiruluta et al., 1997). 

Predictor performance has also been evaluated 
on the basis of simple RMS error metrics 
(Azuma & Bishop, 1994; Kiruluta et al., 1997; 
Akatsuka & Bekey, 1998) as well as less rigor- 
ously from the cursory appearance of co-plotted 
motion and prediction traces (e.g., Liang et al., 
1991; Friedmann et al., 1992). However, as can 
be observed from sample plotted measurement 
and prediction records in the cited prior studies, 
these parameterization and evaluation methods 
may imply good performance while still failing 
to capture fully the undesirable noise and over- 
shoot artifacts introduced by prediction. Conse- 
quently, we propose that these prior simple met- 
rics are incomplete indicators of the user's 
perceptual experience and are therefore insuffi- 
cient for ascertaining the impact of predictive 
compensation on human performance in VEs. 



This paper presents new work in two areas for 
VE predictive compensator design. One is an 
investigation of more sophisticated cost functions 
for optimizing KF predictor parameters. The 
second is an experimental method for subjective 
evaluation of candidate predictor parameteriza- 
tions. The remainder of this paper begins with 
an overview of the Kalman Filter predictor for- 
mulation and a discussion of parameters avail- 
able for optimizing VE head motion prediction. 
Observations of predictor output from pre- 
recorded motion data leads to candidate 
parameter optimization strategies. A perceptual 
experiment to compare the subjective dis- 
criminability of different KF predictor param- 
eterizations is described next. Finally, 
implications of the predictor optimization analy- 
sis and the discriminability experiment results are 
discussed. 


Kalman Filtering and Prediction 

K alman Filter Estimation 

A general linear continuous-time dynamic proc- 
ess to describe head (or other body part) motion 
is 


A(/), evaluated at the beginning of the time 
interval, /,: 


<Pik) = e hM,t> (Eq. 3) 


Measurements reported by the sensor are 
given by 


y(k) = C(k)x(k) + \(k) (Eq. 4) 


C(/c) combines the states contributing to the 
measurement vector y(k). v( k ) is a noise vector, 
again assumed to be zero-mean, Gaussian, and 
uncorrelated between measurement channels, that 
contributes to random variations in sensor 
output. One typical source of sensor noise in 
VEs is the electromagnetic interference associ- 
ated with tracker induced image jitter. 

Given the discrete-time expressions for the head 
tracking and measurement processes of Eqs. (2) 
and (4), the equation 


x(k + 1) = 4K k)x(k) 

+ K(k + l)[y(* + 1) - C(k + l)^Hk)x(k)] 


(Eq. 5) 


employs an observer, + * *. to update an 
estimate, of the state vector x(t > . 


— = A(/)x(/)+ w(/) (Eq. 1) 

dt 

where x( t ) is the vector of system state variables. 
At/) is the system matrix describing the dynamic 
relationships between the state variables, and 
w(/) is the driving plant noise (assumed zero- 
mean Gaussian and uncorrelated between states). 
Plant noise comprises random inputs that drive 
displacements and their higher order time 
derivatives. For a head tracked VE system, these 
inputs include low-frequency volitional com- 
mands as well as involuntary, higher frequency 
signals such as those that might be driving head 
tremors. 

When sampled at uniform time interval h, Eq. 
( 1 ) is rewritten as the discrete system 

x(k + 1) = QHk)x(k)+ v/(k) (Eq. 2) 

in which k represents the sampled values at time 
t = t k = kh. The state transition matrix, QXk), is 
the matrix exponential of the scalar sampling 
interval, h, multiplied by the system matrix. 


The optimal observer, i.e., the Kalman Filter gain, 
is calculated from 

K(A + l) = P<A + l)C(A + l) r x 

. , (Eq- 6) 

[C (it + 1)P(A + 1)C (k + l) r + \(k + 1)J 

where V(/: + l) j s the covariance matrix of the 
sensor noise, + and + ^ is the covari- 
ance matrix of the state estimator error, 
x(k + 1) - x(k + 1) ( x denotes matrix 

multiplication.). 

The error covariance, P( k + 1), is propagated 
from the previous time step according to 

PU- + 1) =<S*k)V{kmk) T + W(it) 

-<tHk)V(k)C(k) T ^ 

x[C(A-)P(A)C(A) r + V(A)] ' ' ' 

x C(k)V(kWk) T 

where W(k) , s the covariance matrix of w (^). 
After substituting for 



P(£ + l) = 

<W*)[I - K(Jt>C(it>]P(/L)«KJt> r + W (A) (Eq ' 8) 

Either of Eqs. (7) and (8) — or one of several 
other equivalent forms — may be selected based 
on computational performance (Brown & 
Hwang, 1997, p. 219). 

For the work described below, both plant and 
sensor noise covariance, as well as the measure- 
ment matrix will be considered time 
invariant — i.e., V(A) = V and W(A) = W, and 
C(A) = C. This leaves the KF gain matrix, K(A), 
dependent only on updates of the error covari- 
ance, P(A) , and the state transition matrix, 4>(A). 

Thus the KF procedure of Eqs. (5) to (8) repre- 
sents an algorithm to compute an estimate of the 
state vector from sensor measurements, y (A), 
such that the expected value of the error, 
x(A) — x(A), between estimated and actual states is 
minimized in the least squares sense. 

Prediction 

From the solution over a single time step for the 
differential equation in Eq. ( 1 ) provided by of 
Eq. (3), and given that the expected value of the 
noise w(A) is zero, the extrapolation from time 
t - t k out to t -- 1, + t of the current state vector 
estimate, x(A), is expressed likewise by 


x(t k + r) = e™' 1 - x(t k ) (Eq. 9) 


sian component, (x, v,c), by its position, />, 
velocity, v, and acceleration, a. Beginning with 
simple kinematic derivatives in the a 
component, 
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(Eq. 12) 


vve note that only the noise component w aK . can 
affect the acceleration state a x . The state vector 
for all three translational components, x ? = 

(P r v r a r p ,v v ,a v , p , v % o ) r , has system matrix, 

(Eq. 13) 

such that 

dx 

= A,x, +w, (Eq. 14) 

dt 
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where w ; is the vector of translational noise 
components. 

Rotational displacements in our system are 
described in terms of the unit quaternion (e.g., 
Kuipers, 1999) 


In the special case that the extrapolation is across 
an integral number of sample steps r = Nh, 

Eq. (3) shows that 


rA </. > 

r> k 


Mi A If, ) -JV.., 
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(Eq. 10) 
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(Eq. 15) 


Therefore, when predicted N steps ahead, the 
state estimate is 


where 9 is the twist angle about the instantaneous 
rotation (i.e., Euler) axis u,. = (w v i,K v .j,w_k)' . 


x(A + N) = 4>' V (A)x(A) (Eq. 11) 

State Space Model 

Following Friedmann et al. (1992), Azuma and 
Bishop (1994), Mazuryk and Gervautz (1995), 
Kiruluta et al. (1997), and others, a simple, 
purely kinematic model is selected to describe 
the interrelation of motion states. Translational 
motion is described independently in each Carte- 


The rotational velocities are to, = (a) v i,&> v j,<w_k)' 
and have as their respective time derivatives the 
accelerations a = (a v i,a v j,a.k)‘ . From Chou 
(1992), the quaternion rate and the rotational 
velocities are related by 

d q 

= 0.5q r 00) (Eq. 16) 

dt 

After carrying out the quaternion multiplication 
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denoted by ®, Eq. (16) can be restated as either 
of the matrix multiplications 


< K 

dt 


= 0.5Q,©, =0.54q, 


(Eq. 17) 


in which 


Q = 
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“tf; 
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(Eq. 18) 


and 
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(Eq. 19) 


Once the nonlinear product terms in Eq. (17) are 
linearized locally about the states' instantaneous 
value at / = t k , the full rotational system model 
for the 10 element state vector, x , can be stated 


as 


d_ 

dt 
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+ w. 


(Eq. 20) 


= A x + w„ 


Since the dynamic systems in Eqs. (14) and (20) 
are independent of each other, the translational 
and rotational components are treated individu- 
ally as separate predictors. One advantage in 
separating the translational and rotational prob- 
lems is that the system matrices to be manipu- 
lated are smaller, with fewer zero entries. 

Another advantage is that the different dynamic 
system structures of Eqs. (14) and (20) require 
wholly different estimation and prediction pro- 
cedures. On one hand, the time-invariant trans- 
lational system in Eq. (14) produces a time- 
invariant state-transition matrix, $»,(£) = < i>, which 
yields a simpler steady-state formulation for the 
Kalman Filter of Eqs. (5) to (7) and the 
extrapolation of Eq. (11). On the other hand, 
because Eq. (17) is a nonlinear function of 
instantaneous state values, the linearized rota- 
tional system matrix A r does vary and therefore 
must be updated regularly to compute the 
instantaneous matrix exponential 4>(A). This 
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repeated model re-parameterization makes the 
rotational component estimator a so-called 
“Extended Kalman Filter” (e.g., Gelb, 1979). 

Noise Parameterization for Head 
Tracking 

The system model dynamics of the two KF esti- 
mator-predictors, and 4> r (fc), are built solely 
from expressions defining the time derivatives of 
translational and rotational displacement. As 
such, Eqs. (12) and (20) are purely kinematic 
models; they cannot be re-parameterized to 
reflect dynamic properties such as damping, 
bandwidth, or more complex neuromotor control 
elements. This leaves only the sensor and plant 
noise covariance matrices V and W as design 
parameters that might be tuned to adjust KF 
predictor performance. 

V and W may be identified in two general 
ways. The covariances can be predefined from 
prior calibration or from validated analytic 
models. Otherwise, if these driving noises cannot 
be well characterized by measurement or analy- 
sis, the covariance matrices (like any other tun- 
able KF model quantity) can be parameterized 
through numerical optimization of estimator or 
predictor performance against a cost criterion. 
Because VE sensor noise parameters can be 
measured by standard engineering techniques, 
we predefined V and chose to investigate more 
generally the parameterization of plant noise, W, 
which at a fundamental level arises from 
physiological muscle activity or neural signals, 
and is therefore not easy to measure or model 
analytically. 

Following Azuma and Bishop (1994), sensor 
noise covariance matrices for the translational 
and rotational predictors, V, and V r , were 
respectively set to be diagonal with 0.01 m 2 for 
the three Cartesian and 0.0001 (dimensionless 
units) for the four quaternion components. 
Despite Azuma and Bishop's (1994) sensors 
being based on a completely different technol- 
ogy, off-line tests for our system's sensors and 
environment showed little KF predictor sensitiv- 
ity to changes in V, and \ r parameterization 

As in Azuma and Bishop (1994) and Liang et al. 
(1992), we used Powell’s method, a multi-dimen- 



sional direction set optimization algorithm (Press, 
Flannery, Teukolsky, & Vetterling, 1986) to 
parameterize W, and W r . The procedure opti- 
mizes parameters of a system of equations with 
respect to the numerical performance of a func- 
tional cost criterion selected by the designer for a 
given input data collection. In our case, the 
input data (with the specific exception noted 
below) were position and quaternion 
measurements recorded from a single subject 
performing the same head and body movements 
that would subsequently be required during the 
discrimination study described below. The algo- 
rithm output throughout was constrained to 
produce diagonal (i.e., uncorrelated between 
states), positive semi-definite covariance 
matrices — negative auto-covariance has no 
physical meaning and could cause unstable KF 
behavior. 

The translational plant covariance w, was opti- 
mized using the same cost function as Azuma 
and Bishop (1994): the simple RMS difference 
over the sample interv al between predicted and 
actual measured position in each of the three 
Cartesian coordinates. Informal analytic and 
subjective observation showed limited predictor 
sensitivity to W, parameterization. We therefore 
focused our investigation on parameterization of 
W r . 

One significant difference between our and 
Azuma and Bishop's implementations (and 
therefore also the parameter search) is that we 
employ discrete-time state transition matrices for 
updating both our translational and rotational KF 
predictor designs; Azuma and Bishop used 4th 
order Runge-Kutta numerical integration to 
propagate their system. Informal observation 
from simulations with Azuma and Bishop's data 
sets indicate that our state transition matrix 
approach yielded smaller RMS errors — possibly 
due to the additional dynamics introduced by the 
Runge-Kutta technique. 

Twist Optimization 

The first of three candidate W ; parameterization 
cost functions considers RMS magnitude of the 
twist angle, AQ , over the sampled movement 

history (/ = 1 ,...,jV) between actual and predicted 
head orientations as in (Azuma & Bishop. 1994). 


=q“ ®(q'')"' (Eq. 21) 

gives the quaternion difference between actual 
( q“) and predicted (qf) head orientation 
quaternions. (Kuipers, 1999) The twist angle is 
calculated by 

Ad. = 2cos"'(^ J ) (Eq. 22) 


in which q u . is the scalar component of q J . The twist 
cost function is simply 



(Eq. 23) 


VE Object Displacement Optimization 

While angular error would be expected from the 
projective geometry of rotations in a head 
tracked VE to be an important indicator for pre- 
diction accuracy, Eq. (22) does not account for 
the twist error’s direction. The cost function of 
Eq. (23) consequently does not permit selective 
weighting of errors in preferred directions. 

Thus, we consider the translational displacement 
components projected by the prediction error, 
qf . of Eq. (21). 

Defining r = r i + r j + rk to represent the vector 
from a point on the head mounted display 
(HMD) to some spatial location or “object of 
interest” in the VE, the translational displace- 
ment of this object due to error in predicting 
head orientation is given by the quaternion ver- 
sion of rotational transformation (Kuipers, 1999) 

if =qf ®r®(qf)~‘ (Eq. 24) 

where the inverse is identical to the conjugate for 
the unit quaternion q“ . if = Av,i + Ay,. j + Ac,k , in 
effect, gives the difference between the predicted 
and actual location of a rendered object in the 
VE due to q, J as seen through the HMD. We 
note that once an “object of interest” in the VE 
is selected, the length scaling provided by Eq. 

(24) allows direct comparison between transla- 
tion and orientation induced predictor errors. 
These comparisons generally support the obser- 
vation for the dominance of orientation over 
translation effects in head tracking predictor 
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accuracy (Zikan et al., 1995), especially for the 
experimental task described below. 

The cost function minimizing errors according 
to the components of is 



where a, /?, and y represent weighting factors to 
selectively penalize error components in a Carte- 
sian coordinate frame fixed to the HMD. For the 
case in our VE system where the y and z com- 
ponents lie in the plane of the HMD, ft and y 
are weighted equally. From trigonometry, small 
twist angle errors produce negligible x variation, 
making a less important. The “object of 
interest” for the parameterization is the virtual 
target sphere in the experiment described below, 
situated at r, 0.8 m in the x direction. 

Noise Frequency Content Optimization 

Predictive compensation, particularly when based 
on the purely kinematic model described above, 
exacerbates higher frequency image jitter in VEs, 
regardless of whether the source is sensor or 
plant noise. This exacerbation is due to the dif- 
ferentiator action needed to generate velocity 
and acceleration state estimates from displace- 
ments. Figure 1 shows samples of increased jitter 
for 7 mjjr optimized predictor output in the 5 Hz 
band and beyond as the look-ahead time (and 
therefore the amount of differentiator action) is 
increased for a subject yawing his head side-to- 
side while wearing the HMD. This 5+ Hz spec- 
tral band remains present when the subject stops 
volitional movement and sits still but vanishes 
when the HMD and sensor are removed from the 
subject and supported instead on an inanimate 
fixed base. Because of its frequency content, we 
believe this human generated activity is associ- 
ated with involuntary normal (so-called 
“physiological”) tremor of the head and body 
(e.g., Desmedt, 1978). Under usual real envi- 
ronment conditions (or in an ideal VE free of 
sensor noise), the vestibulo-ocular consequences 
of one's own physiological tremor are sub- 
threshold and not visually detectable by the 
human. However, under KF prediction, we pro- 
pose that differentiation elevates tremor induced 
image jitter to the point where it is clearly 
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visible and interferes with perceived image 
stability. 

To penalize high frequency jitter amplification 
through parameterization of W r , an optimization 
can be carried out on a high-pass filtered version 
of a pre-recorded motion history consisting pri- 
marily of measured components that we want the 
KF predictor response to suppress. The cost 
function to minimize jitter amplification in the 
HMD, similar to Eq. (25), is 

..IK* 1 • 1 • ! I 

where <$y,, and &, are the high frequency KF 
predictor orientation output projections into 
Cartesian coordinates from Eq. (24) that should 
be driven to zero, a, j 3, and y represent the 
same directional weighting factors as in Eq. (25). 
Although not examined in this work, another 
reasonable alternative is a re-structuring of the 
state matrix (k) to include low-pass filtering 
which could attenuate high frequency jitter. 

Optimization applied to Eq. (26) alone yields a 
predictor that has good noise attenuation char- 
acteristics but poor tracking performance 
because of very low gains in KF matrix K (fc). 

To allow better response for low frequency voli- 
tional activity, the image displacement cost of 
Eq. (25) is reintroduced into a hybrid function 
that combines overall tracking and noise 
attenuation criteria. Because tracking accuracy 
can come at the expense of increased KF 
predictor noise, the hybrid cost function, 
formulated as 

J =a J T - + b ~V- (Eq. 27) 

** / y 

disp l default ,,oise ' default 

allows trade-offs between a and b— respectively 
the relative importance attached to tracking 
accuracy, J dap , and noise reduction, J mine . The 
denominators of each term are scale factors — the 
respective cost functions evaluated at the default 
parameter values reported in Azuma and Bishop 
( 1994 ) — serving to make the weighting factors a 
and b meaningful. 



frequency (Hz) 


Figure 1 . Increasing power spectral densities for 
vertical displacement projected from head rota- 
tion as prediction interval is increased from 0 to 
100 ms in steps of 33 ms. 

Predictor Analyses 

Based on extensive off-line numerical simula- 
tions, three different predictor parameterizations 
were ultimately selected for experimental study. 
The default and twist parameterizations are both 
based on the cost function of Eq. (23). The 
hybrid parameterization is from Eq. (27). The 
default parameterization uses Azuma and 
Bishop’s (1994) exact W, and W. values. W f 
for the twist and hybrid parameterizations and 
W r for the twist and the displacement portion 
( J ,, h „ ) of the hybrid parameterizations were 
optimized for a prediction look-ahead of 50 ms 
from a pre-recorded sample set (-20 s of data 
recorded at 120 Hz) of the side-to-side human 
head motion that would later be required of the 
subjects in the experiment described below. The 
noise portion, J mnr , of the hybrid design, how- 
ever, was based separately on a high pass filtered 
(2nd order, breakpoints at 0.5 and 5 Hz) record 
of a subject’s head motion while sitting still and 
looking straight ahead. The hybrid parameter- 
ization sets p = y and a = 0 for both its J and 

disfj 

J, umr portions (i.e., emphasizing horizontal and 
vertical image displacement errors) and a = b to 
equally weight tracking accuracy and noise 
reduction. 

Table 1 lists a variety of metrics describing the 
three parameterizations’ performance at 50 ms 
of prediction for the same 20 s sample 
movement history used in the optimizations. 


The tabulated values are each ratios. The 
denominator of each ratio — the RMS difference 
based on the particular metric between the unde- 
layed displacement input to the sensor and its 
uncompensated (delayed) 

measurement — expresses the error introduced by 
delay. For the noise metric, the sensor displace- 
ment input is set to zero. The numerator is the 
RMS difference for the particular metric between 
the displacement input to the sensor and the out- 
put of the predictive compensator — a measure of 
the error introduced by imperfect prediction. 
Thus, the smaller the tabulated quality, the better 
a particular predictor’s performance is with 
regard to that specified metric. The twist RMS 
error is defined by Eq. (23). The horizontal 
RMS displacement error and noise arise from 
Eqs. (25) and (26) respectively when a = y = 0; 
the vertical components arise when a - \ 3 = 0. 
Distance (or depth) errors (i.e., /3 = y = 0), per- 
pendicular to the plane of the HMD, were omit- 
ted from the table because of their minimal 
impact on the rendered VE in the discrimination 
experiment. 

From Table 1, the horizontal displacement errors 
appear to be 10% to 15% of the vertical, imply- 
ing that horizontal tracking is better for any of 
the predictor parameterizations. A more likely 
explanation for this imbalance is that the side-to- 
side motion record used to parameterize the pre- 
dictors and to calculate these scores contributes 
mainly to the horizontal metric’s denominator, 
and therefore dominates it but not the vertical. 
Conversely, the horizontal and vertical noise 
scores only use the record of the subject sitting 
still (rather than moving side-to-side) from which 
essentially all voluntary activity was removed by 
high pass pre-filtering, making their denomina- 
tors nearly identical for J defauh . In comparing the 
performance of the three predictor parameter- 
izations, the vertical displacement error and 
horizontal noise offer no specific insight. Verti- 
cal noise suggests that the J hvh design is best, 
while the horizontal displacement error shows 
that the and J M parameterizations are 
equally good. Interestingly, the twist error 
metric indicates that J is best, but that J is 

nwv/ fjyi) 

worse than even the default parameterization. 
However, it is important to caution that the 
observations in Table 1 are calculated from the 
same specific (i.e., side-to-side or stationary) 
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motion records used to form the 
parameterizations. 

Figure 2 presents time domain tracking excerpts 
for the three predictor parameterizations from 
the optimization data sets. These plots show 
horizontal and vertical displacements arising 
from head rotations per Eq. (24). The horizon- 
tal trace is dominated by 0.67 Hz side-to-side 
volitional head motion. The same 0.67 Hz 
rhythm is not obvious in the vertical plots. The 
vertical plots appear more oscillatory in the 5 Hz 
range, which, consistent with the performance 
metrics’ scores in Table 1, may simply be due to 
the absence of the larger scale horizontal motion. 
Examination of the magnified plots shows that 
none of the three predictors track the 5 Hz input 
component particularly well — all three outputs 
show 90 ° or more of phase lag. In fact, the 
hybrid predictor output remains in phase with 
the delayed measurement, evidently not exhib- 
iting the other designs' overshoot simply 
because these higher frequency components are 
not being predicted. 

Power spectral density plots in figure 3 summa- 
rize the frequency domain attributes of the three 
parameterized predictors averaged over the entire 


20 s length of the side-to-side motion 
optimization data set for the calculated 
horizontal and vertical components. In general, 
all predictors match the amplitude of the actual 
input in the range of voluntary head motion up 
to -1.5 Hz. The horizontal component has 
greater power density in this region again 
because the motion is predominantly side-to- 
side. The vertical component spectrum shows a 
prominent bulge in the vicinity of 5 Hz — the 
oscillatory activity associated with head and body 
tremor — that is not discernible in the horizontal 
plot. In the horizontal plots, beginning at 
-1.5 Hz, the predictors' outputs initially all rise 
together above the actual input, eventually 
spreading to between 20 and 40 dB higher, 
indicative of predictor action at higher frequen- 
cies. This behavior is seen in the vertical 
spectrum, but only for the default and twist 
optimized predictors. The hybrid predictor 
follows the input spectrum magnitude very 
closely (up only -4 dB at the 5 Hz bump), con- 
firming the lack of prediction at higher frequen- 
cies noted in the vertical plots of figure 2. The 
comparably heightened noise gain for the 
default and twist predictors' spectra but not the 
hybrid's (which had essentially unmagnified 
vertical noise) concurs with the noise values 


Metric 

Twist Error 

Displ Error 

Displ Error 

Noise 

Noise 

Component 

— 

Horizontal 

Vertical 

Horizontal 

Vertical 

Input Data 

Raw 

Raw 

Raw 

High-Pass 

High-Pass 

^ default 

0.181 

0.161 

1.047 

2.250 

2.251 

J 

neisi 

0.135 

0.104 

0.934 

2.703 

1.993 

^ hyb 

0.457 

0.108 

1.041 

2.449 

1.043 


Table 1. Performance metrics for default, twist, and hybrid optimized parameterization with 50 ms prediction. 
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iglire Predict ° r performance for 50 ms of compensation for horizontal displacement (top) and 
vertical displacement (bottom) components. Note the different displacement scales for all plots. 
(Left) Default parameterization with actual and measured (delayed) displacements converted from 

head rotation. (Right) Regions magnified from corresponding boxes in left plots also include twist 
and hybrid predictor output. 


reported in Table 1, and thereby supports use of 
the hybrid design to prevent magnification of 
high frequency vertical noise components. 

It is important to note that while the time and fre- 
quency domain plots and performance metrics 
represent data for 50 ms of predictive compen- 
sation, results are similar for the longer 67, 83, 
and 100 ms latencies covered in the 
discrimination experiments. 



Figure 3. Predictor power spectra at 50 ms look- 
ahead. Vertical (left) and horizontal dis- 
placement (right). 


Experiment 

VE System Hardware and Software 

The experiment VE and KF predictor software 
ran on an SGI Onyx workstation with four 
R4400 CPUs and dual-pipeline RealityEngine-2 
graphics. The subjects viewed the VE in a 
Virtual Research V8 HMD. The position and 
orientation of subjects' head as well as those of a 
visually presented target object were measured 
by separate Polhemus FasTrak instruments (i.e., 
control boxes), each with a single receiver and 
single transmitter, and each interfaced to its own 
Onyx ASO 1 15.2 KBaud serial port. A variety 
of software techniques, including the streaming 
of sensor input to its own stand-alone software 
process that then transfers data to other simula- 
tion processes on the SGI computer via shared 
memory (Jacoby, Adelstein, & Ellis, 1996), 
enable us to produce fast VEs with low latency, 
high update rates, and reduced temporal 
variability. The sensor-to-display “internal” 
latency for the VE used in our experiment was 
measured by the methods of Jacoby et al. (1996) 
to be 35±5 ms (mean ± stdev) for Cartesian dis- 
placement at a steady frame rate of 60 Hz. 
Concurrently reported quaternion rotations 
averaged 5 ms less (Adelstein, Johnston, & Ellis, 
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1995). A number of additional software and 
hardware procedures that further reduce transla- 
tion and rotation latencies respectively to 30±5 
and 25±5 ms or better were not invoked because 
of potential degradation to frame rate uniform- 
ity. This low internal VE system latency and 
uniform frame rate make possible the controlled 
addition of the time delays required for our 
experimental study. 

The VE for the experiments consisted solely of a 
virtual faceted sphere (i.e., target) in a dark, 
empty space, lit as described by Ellis, Young, 
Adelstein, and Ehrlich (1999a). Subjects were 
seated with the HMD’s FasTrak receiver 0.4 m 
below the FasTrak transmitter. The virtual 
sphere, whose position in the VE was governed 
by an immobile second FasTrak receiver, was 
centered 0.8 m in front of the HMD. Ideally, 
with perfect measurement in the absence of any 
delay, the image of the sphere should move on 
the HMD LCD panels in a manner such that it 
appears to the observer to be fixed in space. In 
the presence of the inevitable delays, the virtual 
sphere will not be locked perfectly in space and 
may appear to move about its ideal fixed 
location. 

Software embodying the estimation and predic- 
tion procedures was implemented as a separate 
process interposed between the sensor and VE 
processes on the SGI workstation. The predictor 
process receives raw data from the sensor process 
in one shared memory location and deposits the 
compensated results into another location. A 
separate shared memory process serves to revise 
predictive compensator parameters in real time. 
The predictors are coded in C and, at present, use 
Matlab's C/C++ Library for computing the 
matrix exponential function of Eq. (3). Pre- 
dictor parameterization sets are typically 
developed beforehand in off-line optimizations 
coded in the Matlab language. 

The multi-processing, multi-processor architec- 
ture of our VE system allowed the predictor to 
run without degradation to the other processes 
during our experiments. Predictor computation 
cycles (rotational and translational combined) 
rarely (<0.05%) exceeded the 8.3 ms window 
required to maintain synchronization with the 
120 Hz FasTrak sampling frequency. 


Discrimination Experiment Protocol 

The ultimate goal of our parameterization effort 
is to produce predictors that remove VE system 
while at the same time not introduce compensa- 
tion artifacts. These experiments aim to study 
user awareness of any artifacts due to the pres- 
ence of imperfect predictive compensation. Our 
experimental approach for determining the 
effect on user perception of different compen- 
sator parameterizations is derived from a 
technique to assess subjective detectability ot 
changes in VE latency (Ellis, Young, Adelstein, 

& Ehrlich, 1999a, 1999b). 

The experimental procedure is based on the fol- 
lowing two alternative forced choice protocol. 

The seated subjects were required to yaw their 
heads from side-to-side in time to a 80 beat/min 
metronome (0.67 Hz or 1 ,5 s per full back-and- 
forth cycle) while maintaining the virtual sphere 
in view. Using any perceivable quality in the 
appearance of the virtual sphere as they moved 
their heads, subjects were asked to judge whether 
sequentially presented VE conditions were the 
same or different and entered their automatically 
logged response through a hand-held push- 
button device. The VE could be running either 
Condition A, at the baseline 35 ms displacement 
latency without prediction, or Condition B, with 
artificial latency added to the baseline that was 
then matched by the predictor's compensation 
interval. 

Each of six latency values (16.7 to 100 ms in 
16.7 ms steps) was blocked into its own ran- 
domly ordered set of 20 judgments such that 
each of the four possible A-B condition pairings 
was repeated five times. The order of blocks of 
individual latencies was also randomized. Pre- 
dictor type was blocked so that subjects com- 
pleted all tests with one predictor before 
proceeding to the next. The six possible pres- 
entation orders for the three predictors (standard, 
twist, and hybrid) were balanced between the 1 2 
subjects. Each subject’s proportion of correct 
discriminations for a particular condition was 
calculated from its set of 20 responses. The 
subjects, who were either lab members or paid 
naive recruits, all had normal or corrected to 
normal vision and no other known impairments. 
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Discrimination Results 

Figure 4 shows the percentage of correct dis- 
criminations between the compensated delay and 
minimal delay conditions averaged across all 
twelve subjects. A 50% correct response propor- 
tion would be expected if subjects were guessing 
randomly at the balanced presentation of stimu- 
lus pairs. Discriminability of any of the predic- 
tors grows monotonically with the number of 
steps of added latency. The separations of the 
curves and standard error bars for in figure 4, 
calculated from the binomial distribution for the 
proportional data, suggests that for compensated 
latencies >33 ms the hybrid predictor's pres- 
ence is less discriminable than the other designs. 

A three-way (latency X predictor type X predic- 
tor order) ANOVA was carried out on the pro- 
portional responses following the arcsine square 
root transformation to convert the data to a 
normal distribution (Sachs, 1984, p. 339). The 
main effects of added latency (F = 56.347; df = 
5,30; p < .001) and predictor type (F = 1 1 .239; 
df= 2,12; p < .002) on the proportion of correct 
responses were significant. Predictor order was 
not significant. Neither were any of the interac- 
tions between the main factors. 



Figure 4 Percent correct discrimination averaged 
across all 12 subjects (mean ± std error) as a 
function of predictor type and latency added to 
the 35 ms VE system minimum. 


both the transient VE image drifts commonly 
associated with system latency and the overshoots 
of prediction. Head yaw motion was chosen 
because it is a principal component of gaze 
direction — the action of visually locating objects 
in both real and virtual environments. Further- 
more, because head rotation rather than transla- 
tion by the body is generally expected to cause 
larger shifts of the visual scene (Zikan et al., 

1995), VE images will be more sensitive to 
response imperfections in the rotational 
components of spatial motion. 

In the experiment, artificial latency was deliber- 
ately added and then compensated back toward 
the VE system baseline delay. This avoided pre- 
diction all the way down to zero absolute latency 
for which we could not then produce a predic- 
tion-free VE control condition. It also enabled 
us in a separate study (Jung, Adelstein, & Ellis, 
2000) to compare the subjective effects specifi- 
cally of the artifacts introduced by prediction 
against the degradation caused by uncompen- 
sated, artificially added VE latencies. 

The experiment results show the hybrid predictor 
parameterization to have better performance 
(lower discriminability) than the two other 
designs tested, and that this improvement appears 
to increase as the amount of additional compen- 
sated latency rises. The twist and default pre- 
dictors did not exhibit significant differences in 
discriminability. While the twist and default pre- 
dictors' parameterizations were based on the 
same cost function, this result is still somewhat 
surprising since the default parameter quantities 
(Azuma & Bishop, 1994) were developed for 
completely different VE system components and 
input human motion. It is noteworthy that all the 
predictor designs still remained above 50 percent 
discriminability (including the standard error 
ranges) — the level associated with random 
guessing. If the subjects did no better than ran- 
dom guessing this would have implied that any 
artifacts introduced by prediction were, on 
average, not perceptible to the subjects. 


Discussion 

The experiment tested human performance in a 
specific, stereotyped head motion meant to elicit 


The better performance of the hybrid param- 
eterization is attributable to its ability to predict 
in the horizontal direction while not predict in 
the vertical. Thus, while the slow (and therefore 
less overshoot prone) side-to-side motion was 
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acted upon, amplification (and consequent 
increased visibility) of the higher frequency 
noise components dominant in the vertical was 
avoided. The behavior of this parameterization 
was dictated by a cost function that weighted 
equally the displacement magnitude of higher- 
frequency noise (such as tremor) and errors 
from tracking low-frequency volitional 
inputs — all as projected into the plane of the 
HMD. None of the cost criteria examined 
explicitly weighted time lag. Also, the hybrid 
cost function as weighted did not itself differen- 
tiate between horizontal and vertical response: 
the difference arose because of the directional 
anisotropy of the motion history upon which the 
optimization was carried out. As such, the 
resulting predictor parameters may only be 
suited for the specific type of motion from which 
they were optimized. 

A number of metrics based on the cost criteria 
for the different optimization schemes were 
examined as indicators of predictor perform- 
ance. Two of the four metrics related to the 
hybrid optimization indicated that the hybrid 
predictor would have the lowest vertical noise 
amplification and tie for best tracking accuracy. 
The other two of the four metrics demonstrated 
no preferred parameterization. The twist error 
metric picked the twist design— more or less as 


expected since this metric was the cost function 
for optimizing twist. Interestingly, the same cri- 
terion also suggested that the hybrid design 
would fare worse than the default parameteriza- 
tion. It might be claimed that this is because the 
simple twist cost function is unable to capture all 
the nuances that the directional metrics can. At 
issue, however, is whether these metrics are ulti- 
mately suitable only for scoring optimizations 
obtained with the same cost function as the 
metric. This question warrants a more detailed 
analysis of these performance metrics, using data 
from many different motion records and from a 
variety of subjects rather than the same single 
data set used in the optimizations for this study. 

Though the optimization cost criteria introduced 
in the course of this work may have general util- 
ity, the specific numerical parameterizations were 
developed for the exact same subject movements 
used in the experiment. Hence, the predictors 
studied might not achieve the same performance 
for other head or body segment motions — recall 
that the hybrid parameterization did not predict 
vertical components. One possible approach to 
this limitation would be to develop banks of 
appropriate predictors whose relative weightings 
could be modulated depending on the type of 
human operator activity. 
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